
    Modeling temporal dimensions of semistructured data

    In this paper we propose an approach to correctly manage valid-time semantics for semistructured temporal clinical information. In particular, we use a graph-based data model to represent radiological clinical data, focusing on the patient model of the well-known DICOM standard, and define the set of (graphical) constraints needed to guarantee that the history of the given application domain is consistent.
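    As a rough illustration of the kind of history-consistency condition such graphical constraints can encode, here is a minimal sketch, assuming a hypothetical valid-time history of patient attribute versions (the names and structure are illustrative, not the DICOM patient model itself): successive versions must carry well-formed, non-overlapping valid-time intervals.

```python
from datetime import date

# Hypothetical valid-time history: each entry is (value, valid_from, valid_to).
history = [
    ("Patient name v1", date(2020, 1, 1), date(2021, 6, 30)),
    ("Patient name v2", date(2021, 7, 1), date(2023, 3, 15)),
]

def valid_time_consistent(history):
    """Check that valid-time intervals are well-formed and non-overlapping."""
    ordered = sorted(history, key=lambda e: e[1])
    for (_, start, end) in ordered:
        if start > end:                      # each interval must be well-formed
            return False
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:           # successive versions must not overlap
            return False
    return True

print(valid_time_consistent(history))  # True for the sample history
```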

    Tracking Data Provenance of Archaeological Temporal Information in Presence of Uncertainty

    The interpretation process is one of the main tasks performed by archaeologists who, starting from ground data about evidence and findings, incrementally derive knowledge about ancient objects or events. Very often more than one archaeologist contributes, at different time instants, to the discovery of details about the same finding; it is therefore important to keep track of the history and provenance of the overall knowledge discovery process. To this aim, we propose a model and a set of derivation rules for tracking and refining data provenance during the archaeological interpretation process. In particular, among all the possible interpretation activities, we concentrate on dating, which archaeologists perform to assign one or more time intervals to a finding in order to define its lifespan on the temporal axis. In this context, we propose a framework to represent and derive updated provenance data about temporal information after the mentioned derivation process. Archaeological data, and in particular their temporal dimension, are typically vague, since many different interpretations can coexist; we therefore use Fuzzy Logic to assign a degree of confidence to values and Fuzzy Temporal Constraint Networks to model the relationships between the dating of different findings, represented as a graph-based dataset. The derivation rules used to infer more precise temporal intervals are enriched to also manage provenance information and its updates after a derivation step. A MapReduce version of the path consistency algorithm is also proposed to improve the efficiency of the refinement process on big graph-based datasets.
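    To make the refinement step concrete, the following is a minimal sketch of path consistency over a network of crisp temporal-distance constraints between findings. It is a simplification of what the paper describes: the fuzzy constraints and confidence degrees are not modeled, and the node names and bounds are hypothetical.

```python
import itertools

# Hypothetical dating constraints between findings A, B, C, expressed as
# admissible bounds (in years) on the difference date(j) - date(i).
constraints = {
    ("A", "B"): (0, 50),
    ("B", "C"): (10, 30),
    ("A", "C"): (-100, 100),
}
nodes = ["A", "B", "C"]

def get(c, i, j):
    """Return the (lo, hi) bound on date(j) - date(i), inverting if needed."""
    if (i, j) in c:
        return c[(i, j)]
    lo, hi = c[(j, i)]
    return (-hi, -lo)

def path_consistency(c, nodes):
    """Tighten every constraint C(i,j) with the composition of C(i,k) and C(k,j)."""
    changed = True
    while changed:
        changed = False
        for i, k, j in itertools.permutations(nodes, 3):
            lo_ij, hi_ij = get(c, i, j)
            lo_ik, hi_ik = get(c, i, k)
            lo_kj, hi_kj = get(c, k, j)
            new = (max(lo_ij, lo_ik + lo_kj), min(hi_ij, hi_ik + hi_kj))
            if new != (lo_ij, hi_ij):
                c[(i, j)] = new
                c.pop((j, i), None)
                changed = True
    return c

print(path_consistency(dict(constraints), nodes)[("A", "C")])  # tightened to (10, 80)
```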

    Operational and abstract semantics of the query language G-Log

    The amount and variety of data available electronically have dramatically increased in the last decade; however, data and documents are stored in different ways and do not usually show their internal structure. In order to take full advantage of the topological structure of digital documents, and in particular of web sites, their hierarchical organization should be exploited by introducing a notion of query similar to the one used in database systems. A good approach, in that respect, is the one provided by graphical query languages, originally designed to model object bases and later proposed for semistructured data, like G-Log. The aim of this paper is to provide a suitable graph-based semantics to this language, supporting both data structure variability and topological similarity between queries and document structures. A suite of operational semantics based on the notion of bisimulation is introduced both at the concrete level (instances) and at the abstract level (schemata), giving rise to a semantic framework that benefits from the cross-fertilisation of tools originally designed in quite different research areas (databases, concurrency, static analysis).
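    The bisimulation idea the semantics builds on can be sketched as follows: a naive partition-refinement check on small node-labelled graphs, where two nodes stay equivalent only if they carry the same label and reach the same classes. The graph encoding and labels are hypothetical, not the G-Log formalism itself.

```python
# Hypothetical document graphs: adjacency lists plus node labels.
graph = {
    "doc1": ["sec1", "sec2"],
    "sec1": [],
    "sec2": [],
    "doc2": ["sec3"],
    "sec3": [],
}
labels = {"doc1": "document", "doc2": "document",
          "sec1": "section", "sec2": "section", "sec3": "section"}

def bisimulation_classes(graph, labels):
    """Iteratively refine the label-based partition until it stabilises."""
    block = dict(labels)  # start from the partition induced by labels
    while True:
        signature = {
            n: (block[n], frozenset(block[m] for m in graph[n]))
            for n in graph
        }
        if all((signature[a] == signature[b]) == (block[a] == block[b])
               for a in graph for b in graph):
            return signature
        block = signature

classes = bisimulation_classes(graph, labels)
print(classes["doc1"] == classes["doc2"])  # True: both reach only 'section' nodes
print(classes["sec1"] == classes["sec3"])  # True: all sections are leaves
```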

    Semi-automatic support for evolving functional dependencies

    During the life of a database, systematic and frequent violations of a given constraint may suggest that the represented reality is changing and thus the constraint should evolve with it. In this paper we propose a method and a tool to (i) find the functional dependencies that are violated by the current data, and (ii) support their evolution when it is necessary to update them. The method relies on the use of confidence, as a measure that is associated with each dependency and allows us to understand “how far” the dependency is from correctly describing the current data; and of goodness, as a measure of balance between the data satisfying the antecedent of the dependency and those satisfying its consequent. Our method compares favorably with literature that approaches the same problem in a different way, and performs effectively and efficiently as shown by our tests on both real and synthetic databases.
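    As a minimal sketch of the confidence measure, under the common interpretation that confidence is the largest fraction of tuples that can be kept while the dependency holds (the table, attributes and data below are hypothetical, and the goodness measure is not modeled):

```python
from collections import Counter, defaultdict

# Hypothetical employee table; we measure how far the data are from
# satisfying the functional dependency department -> manager.
rows = [
    {"department": "sales", "manager": "Ada"},
    {"department": "sales", "manager": "Ada"},
    {"department": "sales", "manager": "Bob"},   # violates the dependency
    {"department": "it",    "manager": "Carla"},
]

def fd_confidence(rows, lhs, rhs):
    """Largest fraction of rows that can be kept so that lhs -> rhs holds."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1
    keepable = sum(max(counter.values()) for counter in groups.values())
    return keepable / len(rows)

print(fd_confidence(rows, "department", "manager"))  # 0.75
```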

    A graph-based meta-model for heterogeneous data management

    The wave of interest in data-centric applications has spawned a high variety of data models, making it extremely difficult to evaluate, integrate or access them in a uniform way. Moreover, many recent models are too specific to allow immediate comparison with the others and do not easily support incremental model design. In this paper, we introduce GSMM, a meta-model based on the use of a generic graph that can be instantiated to a concrete data model by simply providing values for a restricted set of parameters and some high-level constraints, themselves represented as graphs. In GSMM, the concept of data schema is replaced by that of constraint, which allows the designer to impose structural restrictions on data in a very flexible way. GSMM includes GSL, a graph-based language for expressing queries and constraints that, besides being applicable to data represented in GSMM, can in principle be specialised and used for existing models where no language was defined. We show some sample applications of GSMM for deriving and comparing classical data models like the relational model, plain XML data, XML Schema, and time-varying semistructured data. We also show how GSMM can represent more recent modelling proposals: triple stores, the BigTable model and Neo4j, a graph-based model for NoSQL data. A prototype showing the potential of the approach is also described.
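    A minimal sketch of the generic-graph idea, assuming illustrative node kinds and edge labels rather than GSMM's actual parameter set: a single labelled-graph structure instantiated to a tiny relational-style dataset simply by choosing how nodes and edges are labelled.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                 # e.g. "relation", "tuple", "value"
    label: str

@dataclass
class Graph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)   # (source, edge_label, target)

    def add(self, node):
        self.nodes.append(node)
        return node

# Instantiating the generic graph as a relational-style dataset:
# a relation node, one tuple node, and an attribute edge to a value node.
g = Graph()
person = g.add(Node("relation", "Person"))
t1 = g.add(Node("tuple", "t1"))
name = g.add(Node("value", "Alice"))
g.edges.append((person, "has_tuple", t1))
g.edges.append((t1, "name", name))

print([(s.label, e, t.label) for s, e, t in g.edges])
```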

    CoPart: a context-based partitioning technique for big data

    The MapReduce programming paradigm is frequently used to process and analyse huge amounts of data. This paradigm relies on the ability to apply the same operation in parallel on independent chunks of data. The consequence is that the overall performance greatly depends on the way data are partitioned among the various computation nodes. The default partitioning technique, provided by systems like Hadoop or Spark, basically performs a random subdivision of the input records, without considering their nature and the correlations between them. Even if such an approach can be appropriate in the simplest cases, where all the input records always have to be analyzed, it becomes a limitation for more sophisticated analyses, in which correlations between records can be exploited to preliminarily prune unnecessary computations. In this paper we design a context-based multi-dimensional partitioning technique, called COPART, which takes data correlation into account in order to determine how records are subdivided between splits (i.e., units of work assigned to a computation node). More specifically, it considers not only the correlation of data w.r.t. contextual attributes, but also the distribution of each contextual dimension in the dataset. We experimentally compare our approach with existing ones, considering both quality criteria and query execution times.
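    As a simplified, one-dimensional illustration of context-based partitioning (not the actual COPART algorithm; the attribute name and the quantile-based binning rule are assumptions), records can be assigned to splits by binning a contextual attribute along its observed distribution, so that records with similar context end up in the same split even when the dimension is skewed.

```python
import statistics

# Hypothetical records with a contextual attribute: the hour of the day.
records = [{"id": i, "hour": h} for i, h in enumerate([1, 2, 2, 3, 9, 10, 22, 23])]
num_splits = 2

hours = sorted(r["hour"] for r in records)
# Quantile boundaries over the observed distribution of the context dimension.
boundaries = statistics.quantiles(hours, n=num_splits)  # e.g. [6.0] for 2 splits

def split_of(record):
    """Index of the split whose context range contains this record."""
    for i, b in enumerate(boundaries):
        if record["hour"] <= b:
            return i
    return len(boundaries)

splits = {}
for r in records:
    splits.setdefault(split_of(r), []).append(r["id"])
print(splits)  # records with similar hours end up in the same split
```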

    Tracking social provenance in chains of retweets

    In the era of massive sharing of information, the term social provenance denotes the ownership, source or origin of a piece of information that has been propagated through social media. Tracking the provenance of information is becoming increasingly important as social platforms acquire more relevance as a source of news. In this scenario, Twitter is considered one of the most important social networks for information sharing and dissemination, which can be accelerated through the use of retweets and quotes. However, the Twitter API does not provide a complete tracking of retweet chains, since only the connection between a retweet and the original post is stored, while all the intermediate connections are lost. This can limit the ability to track the diffusion of information, as well as to estimate the importance of specific users, who can rapidly become influencers, in the news dissemination. This paper proposes an innovative approach for rebuilding the possible chains of retweets and for estimating the contribution given by each user to the information spread. For this purpose, we define the concept of Provenance Constraint Network and a modified version of the Path Consistency Algorithm. An application of the proposed technique to a real-world dataset is presented at the end of the paper.
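    One plausible way to rebuild a chain when the API only links every retweet to the original post can be sketched as follows: order the retweets by timestamp and attach each one to the most recent earlier retweeter that the user follows. The follow relation and the linking rule below are assumptions for illustration, not the Provenance Constraint Network of the paper.

```python
# Retweets of the same original post, as returned by an API that only
# stores the link to the original tweet (timestamps in seconds).
retweets = [
    {"user": "bob",   "time": 10},
    {"user": "carol", "time": 20},
    {"user": "dave",  "time": 30},
]
original_author = "alice"
# Hypothetical follow relation: follower -> set of followed accounts.
follows = {
    "bob":   {"alice"},
    "carol": {"alice", "bob"},
    "dave":  {"carol"},
}

def rebuild_chain(retweets, original_author, follows):
    """Link each retweet to the most recent earlier source the user follows."""
    chain = []
    ordered = sorted(retweets, key=lambda r: r["time"])
    for i, rt in enumerate(ordered):
        candidates = [p["user"] for p in ordered[:i] if p["user"] in follows[rt["user"]]]
        source = candidates[-1] if candidates else original_author
        chain.append((source, rt["user"]))
    return chain

print(rebuild_chain(retweets, original_author, follows))
# [('alice', 'bob'), ('bob', 'carol'), ('carol', 'dave')]
```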

    A Context-Aware Recommendation System with a Crowding Forecaster

    Recommendation systems (RSs) have become increasingly popular in recent years. Many big IT companies, like Google, Amazon and Netflix, have an RS at the core of their business. In this paper, we propose a modular platform for enhancing an RS for the tourism domain with a crowding forecaster, which is able to produce an estimation of the current and future occupation of different Points of Interest (PoIs) by also taking contextual aspects into consideration. The main advantages of the proposed system are its modularity and the ability to be easily tailored to different application domains. Moreover, the use of standard and pluggable components allows the system to be integrated into different application scenarios.
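    A minimal sketch of how a crowding forecaster could plug into an RS: the relevance score of each PoI is discounted by its forecast occupancy before ranking. The scores, the stand-in forecaster and the discount rule are hypothetical, not the platform's actual components.

```python
# Hypothetical relevance scores from a recommender and occupancy forecasts
# (0.0 = empty, 1.0 = full) from a pluggable crowding forecaster.
relevance = {"museum": 0.9, "castle": 0.8, "park": 0.6}

def forecast_occupancy(poi, hour):
    """Stand-in crowding forecaster; a real module would use context data."""
    return {"museum": 0.95, "castle": 0.40, "park": 0.10}[poi]

def recommend(relevance, hour, crowd_weight=0.5):
    """Rank PoIs by relevance discounted by forecast crowding."""
    scored = {
        poi: score * (1 - crowd_weight * forecast_occupancy(poi, hour))
        for poi, score in relevance.items()
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

print(recommend(relevance, hour=15))
# the crowded museum drops below the less busy castle and park
```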

    Database challenges for exploratory computing

    Helping users to make sense of very big datasets is nowadays considered an important research topic. However, the tools that are available for data analysis purposes typically address professional data scientists, who, besides a deep knowledge of the domain of interest, master one or more of the following disciplines: mathematics, statistics, computer science, computer engineering, and programming. On the contrary, in our vision it is vital to support also different kinds of users who, for various reasons, may want to analyze the data and obtain new insight from them. Examples of these data enthusiasts [4, 9] are journalists, investors, or politicians: non-technical users who can draw great advantage from exploring the data, achieving new and essential knowledge, instead of reading query results with tons of records. The term data exploration generally refers to a data user being able to find her way through large amounts of data in order to gather the necessary information. A more technical definition comes from the field of statistics, introduced by Tukey [12]: with exploratory data analysis the researcher explores the data in many possible ways, including the use of graphical tools like boxplots or histograms, gaining knowledge from the way data are displayed. Despite the emphasis on visualization, exploratory data analysis still assumes that the user understands at least the basics of statistics, while in this paper we propose a paradigm for database exploration which is in turn inspired by the exploratory computing vision [2]. We may describe exploratory computing as the step-by-step “conversation” of a user and a system that “help each other” to refine the data exploration process, ultimately gathering new knowledge that concretely fulfils the user needs. The process is seen as a conversation since the system provides active support: it not only answers user’s requests, but also suggests one or more possible actions that may help the user to focus the exploratory session. This activity may entail the use of a wide range of different techniques, including the use of statistics and data analysis, query suggestion, advanced visualization tools, etc. The closest analogy [2] is that of a human-to-human dialogue, in which two people talk, and continuously make reference to their lives, priorities, knowledge and beliefs, leveraging them in order to provide the best possible contribution to the dialogue. In essence, through the conversation they are exploring themselves as well as the information that is conveyed through their words. This exploration process therefore means investigation, exploration-seeking, comparison-making, and learning altogether. It is most appropriate for big collections of semantically rich data, which typically hide precious knowledge behind their complexity. In this broad and innovative context, this paper intends to make a significant step further: it proposes a model to concretely perform this kind of exploration over a database. The model is general enough to encompass most data models and query languages that have been proposed for data management in the last few years. At the same time, it is precise enough to provide a first formalization of the problem and reason about the research challenges posed to database researchers by this new paradigm of interaction.